Solution: Store a Salted Hash of the Password
Let’s learn how we should save the passwords in our database.
The chief problem in the antipattern Readable Passwords is that the original form of the password is readable. But we can authenticate the user’s input against a password without reading it. This section describes how to implement this kind of secure password storage in an SQL database.
Understanding hash functions
We can do this by encoding the password using a one-way cryptographic hash function. This transforms the input string into a new string, called the hash, which is unrecognizable. Even the length of the original string is obscured because the hash returned by a hash function is a fixed-length string. For example, the SHA-256 algorithm converts our example password, “xyzzy”, to a 256-bit value, usually represented as a 64-character string of hexadecimal digits:
SHA2('xyzzy')
= '184858a00fd7971f810848266ebcecee5e8b69972c5ffaed622f5ee078671aed'
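The fixed-length property can be checked outside the database as well. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex and the second input string are assumptions made for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a string of hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_hash = sha256_hex("xyzzy")
long_hash = sha256_hex("a much longer passphrase than xyzzy")

# Both hashes have the same length, so a stored hash reveals nothing
# about how long the original password was.
print(len(short_hash), len(long_hash))
```

Whatever the input, the result is 64 hexadecimal digits, which is why a fixed-width column is enough to store it.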
Another characteristic of a hash is that it’s not reversible. We can’t recover the input string from its hash because the hashing algorithm is designed to “lose” some information about the input. A good hashing algorithm should take as much work to crack as it would to simply guess the input through trial and error.
A popular algorithm in the past was SHA-1, but researchers have shown that this 160-bit hashing algorithm has insufficient cryptographic strength: two inputs that produce the same hash can be found with far less work than its design intended. The known attacks are very time-consuming, but they take less time than trial and error. The National Institute of Standards and Technology (NIST) announced a plan to phase out SHA-1 as an approved secure hashing algorithm in the U.S. after 2010 in favor of these stronger variants: SHA-224, SHA-256, SHA-384, and SHA-512. Whether we need to comply with NIST standards or not, it’s a good idea to use at least SHA-256 for passwords.
The MD5() function is another popular hash function, producing hash strings of 128 bits. This function has also been shown to be cryptographically weak, so we shouldn’t use it for encoding passwords. Weaker algorithms still have their uses, but not for sensitive information like passwords.
Using a hash in SQL
We redefine the Accounts table so that it stores a hash instead of the password itself. The SHA-256 password hash is always 64 characters long, so we define the column as a fixed-length CHAR column of that length, that is, CHAR(64).
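A minimal sketch of such a table definition follows, run through Python's built-in sqlite3 module so it can be tried without a database server. Only the table name Accounts comes from the text; every column name is an assumption chosen for illustration. SQLite accepts the CHAR(64) declaration but does not enforce the length, whereas other database brands do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Column names are hypothetical; the point is the fixed-width hash column.
conn.execute(
    """
    CREATE TABLE Accounts (
        account_id    INTEGER PRIMARY KEY,
        account_name  VARCHAR(20),
        email         VARCHAR(100) NOT NULL,
        password_hash CHAR(64) NOT NULL  -- 64 hex digits of a SHA-256 hash
    )
    """
)

# List the column names to confirm the table was created as intended.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Accounts)")]
print(columns)
```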
Hashing functions aren’t part of the standard SQL language, so we may need to rely on our database brand to support hashing as an extension. For example, MySQL 6.0.5 with SSL support includes a function SHA2(), which takes the string and the desired hash length in bits and returns a 256-bit hash when that length is 256.
We can validate a user’s input by applying the same hash function to it and comparing the result to the value stored in the database.
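The validation step can be sketched as follows. Because SQLite has no built-in SHA-256 function, this sketch assumes the hash is computed in Python with hashlib and passed to the query as a parameter; the account name "bill", the two-column table, and the helper names are all hypothetical.

```python
import hashlib
import sqlite3


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as 64 hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    " account_name VARCHAR(20) PRIMARY KEY,"
    " password_hash CHAR(64) NOT NULL)"
)

# Store only the hash; the password itself never reaches the table.
conn.execute(
    "INSERT INTO Accounts (account_name, password_hash) VALUES (?, ?)",
    ("bill", sha256_hex("xyzzy")),
)


def password_matches(account_name: str, candidate: str) -> bool:
    """Hash the candidate and compare it to the stored hash inside the query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM Accounts"
        " WHERE account_name = ? AND password_hash = ?",
        (account_name, sha256_hex(candidate)),
    ).fetchone()
    return row[0] == 1


print(password_matches("bill", "xyzzy"))
print(password_matches("bill", "opensesame"))
```

The query never needs the original password: a match on the hash is enough to accept the login.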
We can also lock an account easily by changing the value in the password hash to a string the hash function can’t return. For example, the string “noaccess” contains letters that aren’t hexadecimal digits.
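Locking an account this way is a single UPDATE. The sketch below uses the same assumptions as before: a hypothetical two-column Accounts table in an in-memory SQLite database, a hypothetical account named "bill", and hashing done in Python.

```python
import hashlib
import sqlite3


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as 64 hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (account_name VARCHAR(20), password_hash CHAR(64))"
)
conn.execute("INSERT INTO Accounts VALUES (?, ?)", ("bill", sha256_hex("xyzzy")))


def can_log_in(account_name: str, candidate: str) -> bool:
    """Return True if the hash of candidate equals the stored hash."""
    row = conn.execute(
        "SELECT password_hash FROM Accounts WHERE account_name = ?",
        (account_name,),
    ).fetchone()
    return row is not None and row[0] == sha256_hex(candidate)


before_lock = can_log_in("bill", "xyzzy")

# "noaccess" contains non-hexadecimal letters, so no hash can ever equal it.
conn.execute(
    "UPDATE Accounts SET password_hash = 'noaccess' WHERE account_name = ?",
    ("bill",),
)

after_lock = can_log_in("bill", "xyzzy")
print(before_lock, after_lock)
```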
Adding salt to our hash
If we store hashes instead of passwords and the attacker gains access to our database (by searching our trash for a CD-ROM backup, for example), they can still attempt to guess passwords by trial and error. Guessing each password may take a long time, but they can prepare their own database of hashes of likely passwords against which to compare the hash strings they find in our database. If even one user chose a password that is a word in a standard dictionary, it would be easy for the attacker to find it by simply searching our password database for hashes that match their prepared table of hashes.
They can even do this with SQL:
- The attacker would have prepared a DictionaryHashes table.
- Each entry in that table pairs a likely password with its precomputed hash.
- The attacker would then match the hash strings in our table against the hashes in their table, which were computed using SHA2.
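The steps above can be sketched as a single join. The table name DictionaryHashes comes from the text; the column names, the sample accounts, and the word list are assumptions for illustration, and the hashes are computed in Python because SQLite has no SHA2() function.

```python
import hashlib
import sqlite3


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as 64 hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (account_name VARCHAR(20), password_hash CHAR(64))"
)
conn.execute(
    "CREATE TABLE DictionaryHashes (password VARCHAR(100), password_hash CHAR(64))"
)

# The stolen table: one user picked a dictionary word, one did not.
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [("alice", sha256_hex("password")), ("bob", sha256_hex("k7#Qz!v92pLm"))],
)

# The attacker's precomputed table of likely passwords and their hashes.
words = ["password", "letmein", "xyzzy", "dragon"]
conn.executemany(
    "INSERT INTO DictionaryHashes VALUES (?, ?)",
    [(word, sha256_hex(word)) for word in words],
)

# One join reveals every account whose hash appears in the dictionary.
cracked = conn.execute(
    """
    SELECT a.account_name, d.password
    FROM Accounts AS a
    JOIN DictionaryHashes AS d ON a.password_hash = d.password_hash
    """
).fetchall()
print(cracked)
```

Only the account with the dictionary-word password is exposed, but the attacker paid the hashing cost once and can reuse the same table against every stolen database.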
One way to defeat this kind of “dictionary attack” is by including a “salt” in our password-encoding expression. A salt is a string of meaningless bytes we concatenate with the user’s password, before passing the resulting string to the hash function. Even if the user chose a word in the dictionary as their password, the hash produced from a salted password won’t match the hash in the attacker’s hash database. For example, if the password is the word “password”, we can see that the hash of this word is different from a hash of the word with a few random bytes appended:
SHA2('password')
= '5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8'
SHA2('password-0xT!sp9')
= '7256d8d7741f740ee83ba7a9b30e7ac11fcd9dbd7a0147f4cc83c62dd6e0c45b'
Each password should use a different salt value, so that an attacker must generate a new dictionary table of hashes for each password. That puts the attacker back at square one, because cracking the passwords in our database then takes as much time as guessing them by trial and error.
A good salt is 8 bytes long, generated randomly for each password. The previous examples show a salt string containing printable characters, but we can (and should) make a salt using printable and unprintable bytes.
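A minimal sketch of per-password salting follows, assuming the salt is appended to the password bytes before hashing and is kept alongside the hash so the same computation can be repeated at login. The function names are hypothetical, and os.urandom supplies the random bytes, which may be printable or unprintable.

```python
import hashlib
import hmac
import os


def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash the password bytes followed by the salt bytes with SHA-256."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


def new_credentials(password: str) -> tuple[bytes, str]:
    """Generate a fresh random 8-byte salt and the matching salted hash."""
    salt = os.urandom(8)
    return salt, hash_with_salt(password, salt)


def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    """Recompute the salted hash and compare it to the stored one."""
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)


# Two users who pick the same password still end up with different hashes.
salt_a, hash_a = new_credentials("password")
salt_b, hash_b = new_credentials("password")
unsalted = hashlib.sha256(b"password").hexdigest()

print(hash_a != hash_b, hash_a != unsalted)
print(verify("password", salt_a, hash_a))
```

Because each salt is random, a precomputed table built from unsalted dictionary words matches neither stored hash.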